Exploring similarity-based classification of larynx disorders from human voice

Authors

  • Evaldas Vaiciukynas
  • Antanas Verikas
  • Adas Gelzinis
  • Marija Bacauskiene
  • Virgilijus Uloza
Abstract

This paper investigates the identification of laryngeal disorders from cepstral parameters of the human voice. Mel-frequency cepstral coefficients (MFCCs), extracted from audio recordings of a patient's voice, are approximated using several strategies (sampling, averaging, and clustering by a Gaussian mixture model, GMM). The effectiveness of similarity-based classification techniques in assigning such pre-processed data to normal voice, nodular, and diffuse vocal fold lesion classes is explored, and schemes for combining the binary decisions of support vector machines (SVMs) are evaluated. The widely used RBF kernel was compared with several custom kernels: (i) a sequence kernel, defined over a pair of matrices rather than a pair of vectors, which calculates the kernelized principal angle (KPA) between subspaces; (ii) a simple supervector kernel using only the means of a patient's GMM; (iii) two distance kernels, tailored to exploit the covariance matrices of the GMM, which use as similarity metrics the Kullback–Leibler divergence approximated by Monte-Carlo sampling (KL-MCS) and the Kullback–Leibler divergence combined with the Earth mover's distance (KL-EMD). The sequence kernel and the distance kernels both outperformed the popular RBF kernel, but the difference was statistically significant only for the distance kernels. When tested on voice recordings collected from 410 subjects (130 normal voice, 140 diffuse, and 140 nodular vocal fold lesions), the KL-MCS kernel using GMMs with full covariance matrices and the KL-EMD kernel using GMMs with diagonal covariance matrices provided the best overall performance. In most cases, SVM reached higher accuracy than least squares SVM, except for common binary classification using distance kernels.
The results indicate that modeling features with a GMM and exploiting this information with kernel methods is an interesting fusion of generative (probabilistic) and discriminative (hyperplane) models for similarity-based classification. 2011 Elsevier B.V. All rights reserved.
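The KL-MCS idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes diagonal-covariance GMMs for brevity (the abstract reports full covariances for KL-MCS), it assumes a kernel of the form exp(-gamma * symmetrised KL), which the abstract does not specify, and the names `gmm_logpdf`, `gmm_sample`, `kl_mc`, `kl_kernel`, and `gamma` are all hypothetical. Each GMM is represented as a tuple `(weights, means, variances)` of NumPy arrays.

```python
import numpy as np

def gmm_logpdf(x, weights, means, variances):
    """Log-density of a diagonal-covariance GMM at points x of shape (n, d)."""
    x = np.atleast_2d(x)
    diff = x[:, None, :] - means[None, :, :]              # shape (n, k, d)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances[None], axis=2)
        + np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
    )
    a = log_comp + np.log(weights)[None, :]               # add log mixture weights
    m = a.max(axis=1, keepdims=True)                      # log-sum-exp for stability
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True))).ravel()

def gmm_sample(n, weights, means, variances, rng):
    """Draw n samples: pick a component, then sample from its Gaussian."""
    comp = rng.choice(len(weights), size=n, p=weights)
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comp] + np.sqrt(variances[comp]) * noise

def kl_mc(p, q, n=5000, rng=None):
    """Monte-Carlo estimate of KL(p || q): mean of log p(x) - log q(x), x ~ p."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = gmm_sample(n, *p, rng)
    return float(np.mean(gmm_logpdf(x, *p) - gmm_logpdf(x, *q)))

def kl_kernel(p, q, gamma=0.1, n=5000, seed=0):
    """Assumed kernel form: exp(-gamma * (KL(p||q) + KL(q||p)))."""
    rng = np.random.default_rng(seed)
    d = kl_mc(p, q, n, rng) + kl_mc(q, p, n, rng)
    return float(np.exp(-gamma * max(d, 0.0)))            # clip noisy negative estimates
```

Because the divergence is estimated by sampling, the kernel value is noisy and depends on the sample size and seed. A kernel built this way is also not guaranteed to be positive semi-definite, so the resulting Gram matrix may need checking before it is passed to an SVM as a precomputed kernel.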


Similar resources

Exploring Kernels in Svm-based Classification of Larynx Pathology from Human Voice

In this paper identification of laryngeal disorders using cepstral parameters of human voice is investigated. Mel-frequency cepstral coefficients (MFCC), extracted from audio recordings, are further approximated, using 3 strategies: sampling, averaging, and estimation. SVM and LS-SVM categorize preprocessed data into normal, nodular, and diffuse classes. Since it is a three-class problem, vario...


Comparing the Voice Handicap Index Scores in Groups with Structural and Functional Voice Disorders

Objective: The effects of voice disorders vary from person to person. Occupation, work environment, life, and family reaction are variables that affect one’s perception of his/her own as an impaired voice. Voice Handicap Index (VHI) has not yet been used to compare the degree of voice disorders. Assuming that the quality of life may be different under a variety of voice disorders and that diffe...


Laryngeal Carcinoma in a Pediatric Patient - A Case Report

Introduction:  Carcinoma of the larynx is an extremely uncommon clinical entity in pediatric age. The diagnosis of the laryngeal carcinoma is often delayed due to the low index of suspicion. The factors contributing to delayed diagnosis include the similarity of its symptoms to common benign lesions of the larynx in childhood and difficult examination of the larynx in pediatric patients. ...


Diagnosis of Parkinson’s Disease in Human Using Voice Signals

A full investigation into the features extracted from voice signals of people with and without Parkinson’s disease was performed. A total of 31 people with and without the disease participated in the data collection phase. Their voice signals were recorded and processed. The relevant features were then extracted. A variety of feature selection methods have been utilized resulting in a good perf...


Artificial Neural Networks and Support Vector Machines for Parkinson Disease Detection using Human Voice

Artificial neural network(ANN) with tansig, logsig and purelin transfer function, support vector machines(SVM), linear and quadratic classifiers are used in this work for the detection of Parkinson disease using voice features. In the Parkinson disease, voice of a person changes because of presence of tremor in the voicebox muscles. Total 195 phonations were used for the analysis from twenty th...



Journal:
  • Speech Communication

Volume 54

Publication date: 2012